Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Format tables including full width characters properly #919

Conversation

soutaro
Copy link

@soutaro soutaro commented Oct 2, 2015

This PR is to format tables including fullwidth characters properly based on Unicode East_Asian_Width informative property.

Pretty formatter does not print tables including fullwidth characters properly as discusses in #251. It prints as

  Scenario: Test   # features/table.feature:2
    Given A table: # features/step_definitions/table_steps.rb:3
      | Cucumber |
      | キュウリ     |

where we want

  Scenario: Test   # features/table.feature:2
    Given A table: # features/step_definitions/table_steps.rb:3
      | Cucumber |
      | キュウリ  |

This is because the text width is calculated by number of Unicode code points and does not consider character width at all. This patch uses unicode gem to calculate text width if the library is loaded. (I am not very sure this improvement matters to have another dependency.)

The Cucumber.treats_ambiguous_as_fullwidth options is to control the width of ambiguous characters like ø. I don't think it is the best place to have that option.

Use `Unicode.width` to calculate table cell/column width if `unicode` library is loaded.
* Move Unicode related examples to another file
* Run specs only if the interpreter is not JRuby
require 'cucumber/formatter/pretty'
require 'cucumber/cli/options'

unless Cucumber::JRUBY
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this logic? Does the unicode gem not work for JRuby?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mattwynne
Copy link
Member

@soutaro can you explain why we need the treats_ambiguous_as_fullwidth option?

@mattwynne mattwynne added the WIP label Apr 19, 2016
@soutaro
Copy link
Author

soutaro commented Apr 20, 2016

The reason is mainly for compatibility with terminal apps. At least, Terminal.app in Mac OS X have an option to control that. (I'm using with the default setting, disabled.)

(I'm not sure how many people are using the option. It may be okay to remove that...)

@@ -30,6 +30,7 @@ Gem::Specification.new do |s|
s.add_development_dependency 'coveralls', '~> 0.7'
s.add_development_dependency 'syntax', '>= 1.0.0'
s.add_development_dependency 'pry'
s.add_development_dependency 'unicode'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering why this is a development dependency. Won't we need it at runtime too? Or is the idea that we'll expect users to have to have installed it themselves if they want to benefit from it?

Copy link
Member

@mattwynne mattwynne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @soutaro,

I'm terribly sorry it's taken so long to give you feedback on this, and I'm really grateful to you for sharing your code with us.

I took the time this week to try and merge your PR, and I realised it was going to take more work than I had time for in that session. So I've written up what I would have done myself as review comments. If you (or anyone else reading!) wants to pick up those review comments and action them, I think this would be ready to merge.

Essentially it boils down to:

  1. Using the global config setting rather than using the Configuration instance which is where all other options are stored.
  2. A bit of duplication between the pretty formatter and the data table's Cells class
  3. A puzzle about whether unicode should be a development dependency or not

@@ -9,6 +9,7 @@
module Cucumber
class << self
attr_accessor :wants_to_quit
attr_accessor :treats_ambiguous_as_fullwidth
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think having this option is fine, but we need to add it to https://github.com/cucumber/cucumber-ruby/blob/master/lib/cucumber/configuration.rb instead.

This has implications for how we use it in the DataTable, but we can figure that out.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also suggest we rename it to treat_ambiguous_width_characters_as_double_width to be crystal clear.

@@ -692,7 +692,18 @@ def index
end

def width
map{|cell| cell.value ? escape_cell(cell.value.to_s).unpack('U*').length : 0}.max
map {|cell|
if cell.value
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's extract a Cell#width method here and call it.

if cell.value
escaped = escape_cell(cell.value.to_s)
if defined?(Unicode)
Unicode.width(escaped, Cucumber.treats_ambiguous_as_fullwidth)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of using the global option here, let's accept this specific option as a constructor parameter (using an options hash or keyword argument would be best IMO) into the DataTable and pass it down to the Cell.

Wherever we construct instances of DataTable we'll almost certainly have access to an instance of the configuration where we can read that setting from.

@@ -203,7 +203,8 @@ def table_cell_value(value, status)
status ||= @status || :passed
width = @table.col_width(@col_index)
cell_text = escape_cell(value.to_s || '')
padded = cell_text + (' ' * (width - cell_text.unpack('U*').length))
text_width = defined?(Unicode) ? Unicode.width(cell_text, Cucumber.treats_ambiguous_as_fullwidth) : cell_text.unpack('U*').length
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code looks quite similar to what's in DataTable::Cells#width below. If we move that logic into a Cell#width method, could we re-use that logic here by constructing an instance of a Cell?

@stale
Copy link

stale bot commented Nov 8, 2017

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in a week if no further activity occurs.

@stale stale bot added the ⌛ stale Will soon be closed by stalebot unless there is activity label Nov 8, 2017
@stale
Copy link

stale bot commented Nov 15, 2017

This issue has been automatically closed because of inactivity. You can support the Cucumber core team on opencollective.

@stale stale bot closed this Nov 15, 2017
@lock
Copy link

lock bot commented Nov 15, 2018

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked as resolved and limited conversation to collaborators Nov 15, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
⌛ stale Will soon be closed by stalebot unless there is activity
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants